Top 200 Data Engineer Interview Questions & Answers by Knowledge Powerhouse

Author: Knowledge Powerhouse [Powerhouse, Knowledge]
Language: eng
Format: epub
Published: 2017-03-19T07:00:00+00:00


E.g. CREATE TABLE tableName (column1 STRING, column2 STRING) SKEWED BY (column1) ON ('value1');

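Hive stores the rows for the listed skewed values in separate files (or in separate directories, if the table is also declared STORED AS DIRECTORIES). During queries, Hive can then skip or read just those files, so we generally get better performance with SKEWED tables.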

90. What is the use of the CLUSTERED BY clause during table creation in Hive?

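CLUSTERED BY in a CREATE TABLE statement defines bucketing. Hive applies a hash function to the clustering column(s) and uses the result to assign each row to one of a fixed number of buckets, declared with INTO n BUCKETS. If we also want the rows inside each bucket to be sorted, we add a SORTED BY clause. This is related to, but not the same as, the CLUSTER BY clause in a query, which is shorthand for DISTRIBUTE BY and SORT BY on the same columns: it first distributes the data to different reducers by hash and then sorts the data within each reducer.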

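We have to specify the CLUSTERED BY clause during table creation, but the benefit shows up when we query the data: a bucketed table supports efficient sampling with TABLESAMPLE and lets Hive use bucket map joins when both tables are bucketed on the join key.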
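A minimal sketch of a bucketed table may make this concrete. The table name, column names, bucket count, and storage format below are hypothetical and chosen only for illustration; only the CLUSTERED BY, SORTED BY, INTO n BUCKETS, and TABLESAMPLE clauses are the point of the example.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE page_views (
  user_id   BIGINT,
  page_url  STRING,
  view_time TIMESTAMP
)
CLUSTERED BY (user_id)       -- hash user_id to choose the bucket for each row
SORTED BY (view_time ASC)    -- optional: keep rows ordered inside each bucket
INTO 32 BUCKETS              -- assumed bucket count
STORED AS ORC;               -- assumed storage format

-- Sample one of the 32 buckets. Because the table is bucketed on user_id,
-- Hive can read just the matching bucket instead of the whole table.
SELECT user_id, page_url
FROM page_views TABLESAMPLE (BUCKET 1 OUT OF 32 ON user_id);
```

The bucket count is fixed when the table is created, so it is usually picked with the expected data volume and the join partners in mind.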


